ABSTRACT


Most reliability analysis techniques and tools assume that a system is used for a mission consisting of a
single phase. However, multiple phases are natural in many missions. The failure rates of components, system
configuration, and success criteria may vary from phase to phase. In addition, the duration of a phase may be
deterministic or random. Recently, several researchers have addressed the problem of reliability analysis of
such systems using a variety of methods. We describe a new technique for phased-mission system reliability
analysis based on Boolean algebraic methods. Our technique is computationally efficient and is applicable
to a large class of systems for which the failure criterion in each phase can be expressed as a fault tree (or
an equivalent representation). Our technique avoids the state space explosion that commonly plagues Markov
chain-based analysis. We develop a phase algebra to account for the effects of variable configurations and
success criteria from phase to phase. Our technique yields exact (as opposed to approximate) results. We
demonstrate the use of our technique by means of an example and present numerical results to show the effects
of mission phases on the system reliability.
1  Introduction


The reliability analysis of ultra-reliable computer systems is an important problem for which various
techniques and tools have been developed [1]-[4]. Most analysis techniques assume that the systems operate in
single-phase  missions.  However,  multiple  phases  are  natural  in  many  applications.  The  system  configuration, 
operational  requirements  for  individual  components,  the  success  criteria,  and  the  stress  on  the  components 
(and  thus  the  failure  rates)  may  vary  from  phase  to  phase.  For  example,  fault  tolerant  systems  may  consist  of 
multiple  subsystems  employing  redundancy  and  may  have  dedicated  or  pooled  spares.  A  dedicated  spare  can 
replace  only  a  single  preassigned  function.  A  pooled  spare,  on  the  other  hand,  has  the  capability  of  replacing 
any  of  the  several  functions  in  the  system.  Depending  on  the  requirements  during  different  phases,  spares 
may be placed in service or removed from service to balance the system reliability and the cost of operation.
The  success  of  a  redundancy  management  scheme  defines  if  a  system  is  operational  or  not.  The  usage  of 
subsystems  may  also  vary  from  phase  to  phase  and  subsystems  supporting  those  services  may  remain  idle 
or  may  be  switched  off.  Furthermore,  the  duration  of  any  phase  may  be  deterministic  or  random.  All  these 
variations  affect  the  system  reliability. 

Sometimes the effects of phased missions can be ignored in favor of simpler analysis. For example, in an
airplane system, landing gear and its associated control subsystems are not required during the cruising phase.
So exact analysis should not ignore such failures. But, continuing to count the failure of landing gear during
the cruising phase has very little impact on the overall unreliability and may simplify the computation. However,
most of the time only conservative estimates can be made, thus yielding the worst-case unreliability of the
system. One adverse effect of this is that the systems are over-designed. For economic reasons, it may be
desirable to perform more accurate analysis. In particular, if one phase may see much more stress than others,
then it is necessary to account for these effects properly. It is not accurate to use conservative parameters for
the entire mission. On the other hand, the impact of a phase with the severest parameter values must not be
ignored in analysis. Different aspects of phased-mission systems have been discussed by several researchers.


Figure 1: The three units in a system


To describe and compare the work of others and our own, we will use a three-component system as
an example. Components A, B, and C are used in a system which is employed in a mission with three phases.
The phases are denoted as Phase X, Phase Y, and Phase Z, respectively. To show the effect of phased-mission
analysis we will consider all six permutations of these three phases. That is, we will assume that the mission
may go through the three phases in any order. So one particular order may be Phases X, Y, and Z, and another
could be Phases Z, Y, and X. The success criterion for each of the three phases is expressed using a fault tree,
as shown in Figure 2. In Phase X, the system fails if any of the components A, B, or C fails. In Phase Y,
the system fails if component A fails or both of the components B and C fail. In Phase Z, the system fails
if all three components fail. The failure rates of the three components are $\lambda_a$, $\lambda_b$, and
$\lambda_c$, respectively.

Figure 2: The success criteria for phases expressed using fault trees

The corresponding Markov chains for all phases are shown in Figure 3. In the Markov chain representation,
a 3-tuple represents a state indicating the status of the three components, respectively. A "1" represents
that the corresponding component is alive and a "0" represents that the component has failed. For example,
a state (101) implies that component B has failed and the other two components are alive. A transition from
one state to another state has a rate associated with it, which is the failure rate of the component that fails.
For example, a transition from state (011) to state (010) has a transition rate of $\lambda_c$. States marked F are
failed states.
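The fault trees and state encoding described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the assumption that all failed combinations of a phase are merged into a single state F is ours, made so that the state counts are easy to read off.

```python
from itertools import product

# Fault trees of Figure 2 as Boolean functions of component failure flags
# (True means the component has failed).
def fails_x(a, b, c):
    return a or b or c

def fails_y(a, b, c):
    return a or (b and c)

def fails_z(a, b, c):
    return a and b and c

def markov_state_count(fails):
    """States of one phase: every operational combination of component
    states plus one merged failed state F (an assumption of this sketch)."""
    up_states = [s for s in product([False, True], repeat=3) if not fails(*s)]
    return len(up_states) + 1

state_counts = {name: markov_state_count(tree)
                for name, tree in [("X", fails_x), ("Y", fails_y), ("Z", fails_z)]}
print(state_counts)
```

Under this assumption, the stricter the success criterion of a phase, the fewer operational states its Markov chain needs.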


2  Related  Work 

Esary and Ziehms [5] discuss the analysis of multiple-configuration systems during different phases of a mission to
accomplish specified goals. In their approach, each phase of a system is modeled using a separate reliability
block diagram (RBD). For phase p, a component C is represented by a series of blocks $C_1, C_2, \ldots, C_p$, where
$C_i$ represents the probability of failure (or success) associated with component C in phase i and depends
on the failure rate of that component during that phase. All phase RBDs are connected in series, as shown
in Figure 4 for a three-phase system using three components. Solution of this RBD correctly predicts the
reliability of the three-phase system. The problem with this approach is a large RBD with several common
events, the solution of which may be computationally very expensive. Each component generates p basic
events for a p-phase system. A k-component system will thus have k*p basic events, and obtaining cut
sets after accounting for common events will be expensive. An approximate solution of the RBD may include large
errors due to the multiple common events.

Figure 3: The Markov chains for three phases

Figure 4: Reliability block diagram for a three phases system with variable configuration

Figure 5: The multi-phase Markov chain

Pedar and Sarma [6] carry out phased-mission analysis of an aerospace computing system using an
approach similar to that of Esary and Ziehms. They developed a procedure to systematically cancel out the common
events  in  earlier  phases  which  are  accounted  for  in  later  phases.  Alam  and  Al-Saggaf  [7]  developed  a  technique 
to  analyze  repairable  systems  in  which  system  success  criteria  and  failure  rates  of  components  may  vary  from 
phase  to  phase. 

Smotherman and Zemoudeh [9] use a non-homogeneous Markov model to carry out a phased-mission
system analysis. They represent the behavior of the system in each phase using a different Markov chain,
and each phase is represented by a separate subset of the states. The state transitions, which are described
in terms of random variables, are generalized to include phase changes. Therefore, state-dependent phase
changes, random phase durations, and time-varying failure and repair behavior are readily modeled. A complete
Markov chain of the three-phase system of Figure 2 with phase order X, Y, and Z is shown in Figure 5. The
major drawback of this approach, like the RBD approach of Esary and Ziehms, is a huge non-homogeneous
Markov chain. The size of the state space is as big as the sum of the number of states in each of the individual
phases. This requires a large amount of storage and computation time to solve a system, limiting the kind of
systems that can be analyzed.

Somani et al. [10] presented a computationally efficient method to analyze multi-phased systems and a
new software tool for reliability analysis of such systems. A system with variable configuration and success
criteria results in different Markov chains for different phases, as shown in Figure 5. In the approach of
Somani et al., instead of a single Markov chain, Markov chains for individual phases are developed and solved
separately. The issue of varying success criteria and change in system configuration from phase to phase
is addressed by providing an efficient mapping procedure at the transition time from one phase to the next
phase. While analyzing a phase, only the states relevant to that phase are considered. Thus each individual
Markov chain is much smaller than in Smotherman and Zemoudeh [9]. For example, in Figure 5, three Markov
chains with 2, 4, and 8 states, respectively, are solved instead of a single Markov chain with 12
states. Using this approach, the computation time for large systems can be reduced significantly without
compromising accuracy. Phases may be of a fixed or a random duration. The reliability (or unreliability)
of the system can be computed from the output of the final phase. Furthermore, the technique is sufficiently
general.

Figure 6: Two scenarios for phased-mission systems with variable configuration

Using a similar approach, Dugan [8] suggested another method in which a single Markov chain with state
space equal to the union of the state spaces of the individual phases is generated. The transition rates are
parameterized with phase numbers and the Markov chain is solved p times for p phases. The final state
occupation probabilities of one phase become the initial state occupation probabilities for the next phase.
In her approach, once a state is declared a system down state in a phase, it cannot become an up state in
a later phase. This is a potential problem, as it is possible for a system to have some states that are failure
states in a phase but are up states in a later phase. For example, consider the two scenarios shown in
Figure 6. In the first case (Figure 6a), the phase order is Phase X, Phase Y, and Phase Z. In this case, some of
the states are failure states in the first phase and are later treated as forced failure states although they
are not failure states in phases 2 and 3. Such states are marked as F(1,2',3') or F(1,2,3'). In the second case,
the phase order is Phase Z, Phase Y, and Phase X. In this case, there are no forced failure states.

In  this  paper,  we  present  a  methodology  to  analyze  and  solve  phased-mission  systems  in  which  failure 
rates,  configuration  and  success  criteria  can  vary  from  phase  to  phase.  Moreover,  the  success  criteria  can 
be  specified  using  fault  trees  or  an  equivalent  representation.  We  believe  that  a  majority  of  systems  can  be 
represented  using  fault  trees.  Our  approach  is  similar  to  Esary  and  Ziehms’  in  that  we  do  not  generate  any 
Markov chains, but in addition we do not create a single, monolithic model. We handle one phase at a time
and  then  compute  the  overall  unreliability  of  the  entire  mission.  This  gives  us  a  computational  advantage. 




First  we  describe  some  concepts  which  we  will  use  throughout  the  paper. 


3  Distribution Functions with Mass at Origin

One  of  the  key  concepts  we  will  use  in  our  method  is  that  of  cumulative  distribution  functions  with  a  mass 
at  the  origin.  Consider  a  random  variable  X  with  cumulative  distribution  function  given  by 

$$F_X(t) = \left(1 - e^{-\lambda T_1}\right) + e^{-\lambda T_1}\left(1 - e^{-\lambda t}\right)$$

This function has a mass at the origin given by $P(X = 0) = 1 - e^{-\lambda T_1}$. The second term represents the
continuous part of the distribution function.

In  order  to  illustrate  the  use  of  such  a  CDF,  consider  a  component  with  a  failure  rate  of  A  that  is  used 
in  a  phased  mission  system.  Assume  that  the  system  has  just  completed  one  phase  of  duration  Ti  and  is 
currently  in  the  second  phase.  The  above  CDF  can  be  assigned  as  the  failure  probability  distribution  of 
the  component  in  the  second  phase.  The  first  term  in  the  above  expression  represents  the  probability  that 
the  component  has  already  failed  in  the  previous  phase.  The  second  term  represents  the  failure  probability 
distribution  for  this  component  for  the  second  phase.  The  time  origin  for  the  second  phase  is  reinitialized 
to  the  beginning  of  the  phase.  We  will  use  such  distribution  functions  to  represent  failure  probabilities  of 
individual  components  during  different  phases. 
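As a minimal sketch of the second-phase distribution just described, assume a component with a constant failure rate `lam` that has already been exposed to a first phase of duration `t1`. The function name and the numeric values are illustrative only.

```python
import math

def second_phase_cdf(t, lam, t1):
    """Failure CDF in the second phase; t is measured from the start of that phase."""
    mass_at_origin = 1.0 - math.exp(-lam * t1)   # component already failed in phase 1
    survived_first = math.exp(-lam * t1)         # component entered phase 2 working
    return mass_at_origin + survived_first * (1.0 - math.exp(-lam * t))

lam, t1 = 1e-3, 100.0                            # hypothetical rate and phase length
print(second_phase_cdf(0.0, lam, t1))            # the mass at the origin
print(second_phase_cdf(50.0, lam, t1))           # failure probability 50 time units into phase 2
```

With a single constant rate the result coincides with the ordinary exponential CDF evaluated at `t1 + t`; the mass-at-origin form becomes useful once the rate differs from phase to phase.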


4  Phased-Mission  Analysis:  Phase  Independent  Success  Criteria 

In  this  section  we  consider  a  simpler  scenario,  a  phased-mission  system  in  which  the  success  criterion  is  phase 
independent. Therefore, the system configuration and the success criteria remain unchanged from phase to
phase  and  can  be  represented  by  the  same  fault  tree  for  all  phases.  However,  component  failure  rates  are 
allowed  to  be  phase  dependent.  We  first  assume  that  phase  durations  are  deterministic.  We  will  relax  these 
constraints  one  at  a  time  in  the  following  subsections. 

4.1  Phase-Dependent Failure Rates

To account for phase-dependent failure rates, we assign a failure distribution with mass at the origin to each
component. Let $\lambda_{ji}$ represent the failure rate of component j in phase i. For component j, the distribution
function assigned in phase k is given by

$$F_{c_j,k}(t) = \left(1 - e^{-\sum_{i=1}^{k-1}\lambda_{ji}T_i}\right) + e^{-\sum_{i=1}^{k-1}\lambda_{ji}T_i}\left(1 - e^{-\lambda_{jk}t}\right) \qquad (1)$$

Here time t is measured from the beginning of phase k, so that $0 \le t \le T_k$. $T_i$ represents the duration of
phase i. This expression can be simplified to $F_{c_j,k}(t) = 1 - e^{-\lambda_{jk}t}\left[e^{-\sum_{i=1}^{k-1}\lambda_{ji}T_i}\right]$. At the end of phase k,
at $t = T_k$, the above expression gives the mass at the origin for phase k+1. A component fails during a
phase only if it survives all the previous phases. The factor enclosed in square brackets above is the
probability of success during the first k-1 phases. Since the success criterion is the same in all phases, a system fails
by phase k if it fails at any time during the first k phases. We can obtain the unreliability of the system at
time $0 \le t \le T_k$ during phase $1 \le k \le m$ by evaluating the fault tree using the failure distribution function
for each component as given by $F_{c_j,k}(t)$. Of course, if our only interest is in the failure probability for the
entire mission, we evaluate the fault tree assigning a constant failure probability

$$1 - e^{-\sum_{i=1}^{m}\lambda_{ji}T_i}$$

to component j.
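The following sketch evaluates the per-component distribution of this subsection and then a fault tree with the resulting constant failure probabilities. The rates, durations, and function names are hypothetical, and the fault tree of Phase Y is used in every phase purely as an example of a phase-independent criterion; components are assumed to fail independently.

```python
import math

def component_cdf(t, k, rates, durations):
    """Failure CDF of one component in phase k (1-based); t starts at 0 in phase k."""
    survived = math.exp(-sum(r * d for r, d in zip(rates[:k - 1], durations[:k - 1])))
    return (1.0 - survived) + survived * (1.0 - math.exp(-rates[k - 1] * t))

def mission_failure_prob(rates, durations):
    """Constant failure probability of one component over the whole mission."""
    return 1.0 - math.exp(-sum(r * d for r, d in zip(rates, durations)))

def phase_y_tree(q_a, q_b, q_c):
    """P(A fails OR (B and C fail)) for independent components."""
    return q_a + (1.0 - q_a) * q_b * q_c

durations = [10.0, 50.0, 5.0]                  # hypothetical phase lengths
rates = {"A": [1e-4, 2e-4, 5e-4],              # hypothetical per-phase failure rates
         "B": [1e-3, 1e-3, 2e-3],
         "C": [1e-3, 2e-3, 2e-3]}
q = {name: mission_failure_prob(r, durations) for name, r in rates.items()}
mission_unreliability = phase_y_tree(q["A"], q["B"], q["C"])
print(mission_unreliability)
```

Evaluating `component_cdf` at the end of the last phase gives the same value as `mission_failure_prob`, which is the consistency the text relies on.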


4.2  Age-Dependent  Failure  Rates 

If the failure rates of components are phase and age dependent, then we cannot count time for each phase
independently. Instead, to compute the failure probability distribution, we have to account for the global
(mission) time and its effect on each component. This can be achieved by assigning the failure distribution
function for component j in phase k as follows.

$$F_{c_j,k}(t) = \left(1 - e^{-\sum_{i=1}^{k-1}\int_{CT_{i-1}}^{CT_i}\lambda_{ji}(\tau)\,d\tau}\right) + e^{-\sum_{i=1}^{k-1}\int_{CT_{i-1}}^{CT_i}\lambda_{ji}(\tau)\,d\tau}\left(1 - e^{-\int_{CT_{k-1}}^{t}\lambda_{jk}(\tau)\,d\tau}\right).$$

Here,

$$CT_i = \sum_{l=1}^{i} T_l$$

is the sum of the durations of the first i phases, and $CT_0 = 0$. The time t is the cumulative time and is not reset to zero
for the next phase. Instead it starts at t = 0 at the beginning of a mission and continues to increase. With
this modification, the fault tree can be evaluated for any time $0 \le t \le CT_m$. The probability of failure of
component $C_j$ at the end of the mission is given by

$$1 - e^{-\sum_{i=1}^{m}\int_{CT_{i-1}}^{CT_i}\lambda_{ji}(\tau)\,d\tau}$$

Using  this  constant  failure  probability  for  component  Cj  (for  all  j),  the  fault  tree  can  be  evaluated  to 
obtain  the  mission  failure  probability. 
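A small numerical sketch of the age-dependent case follows. It accumulates the integral of the failure rate over each phase in global mission time and converts the total into a mission failure probability. The trapezoidal integrator, the function names, and the linearly increasing rate are assumptions chosen for illustration.

```python
import math

def integrate(f, lo, hi, steps=1000):
    """Composite trapezoidal rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for n in range(1, steps):
        total += f(lo + n * h)
    return total * h

def mission_failure_prob(hazards, durations):
    """hazards[i] is the failure rate, as a function of global time, used in phase i+1."""
    cumulative_hazard = 0.0
    start = 0.0                                  # cumulative time at the start of the phase
    for hazard, duration in zip(hazards, durations):
        end = start + duration                   # cumulative time at the end of the phase
        cumulative_hazard += integrate(hazard, start, end)
        start = end
    return 1.0 - math.exp(-cumulative_hazard)

durations = [1.0, 1.0]                           # hypothetical phase lengths
hazards = [lambda t: 2.0 * t, lambda t: 2.0 * t] # hypothetical rate growing with age
print(mission_failure_prob(hazards, durations))
```

With constant rate functions the same routine reduces to the phase-dependent result of Section 4.1, which is a convenient sanity check.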

4.3  Random  Phase  Durations 

To account for random phase durations, we use conditioning followed by the theorem of total probability.
Let $F_{T_i}(t_i)$ be the distribution function for the length of phase i. These distributions are specified by the
user. Conditioning on the durations of the phases, $T_1 = t_1, T_2 = t_2, \ldots, T_m = t_m$, the mission failure probability
for component j is given by

$$1 - e^{-\sum_{i=1}^{m}\lambda_{ji}t_i}$$

Then the unconditional failure probability for component j is given by

$$\int\!\!\int\!\cdots\!\int \left[1 - e^{-\sum_{i=1}^{m}\lambda_{ji}t_i}\right] dF_{T_1}(t_1)\,dF_{T_2}(t_2)\cdots dF_{T_m}(t_m) = 1 - \prod_{i=1}^{m} F^{*}_{T_i}(\lambda_{ji})$$

where $F^{*}_{T_i}(s)$ is the LST (Laplace-Stieltjes transform) of $T_i$, so that $F^{*}_{T_i}(s) = \int_0^{\infty} e^{-s t_i}\,dF_{T_i}(t_i)$.

This failure probability can be assigned to component $C_j$ (for all j) and the fault tree can be evaluated
to compute the unreliability of the system for the whole mission consisting of m phases.
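To make the role of the Laplace-Stieltjes transform concrete, the sketch below computes the unconditional failure probability of one component for two duration models whose transforms are standard: an exponentially distributed duration with rate mu, with transform mu / (mu + s), and a fixed duration T, with transform exp(-s T). The function names and values are hypothetical, and independent phase durations are assumed.

```python
import math

def lst_exponential(mu):
    """LST of an exponentially distributed phase duration with rate mu."""
    return lambda s: mu / (mu + s)

def lst_deterministic(duration):
    """LST of a fixed phase duration."""
    return lambda s: math.exp(-s * duration)

def mission_failure_prob(rates, duration_lsts):
    """One minus the product over phases of the duration LST evaluated at the phase failure rate."""
    survive = 1.0
    for rate, lst in zip(rates, duration_lsts):
        survive *= lst(rate)
    return 1.0 - survive

# Two phases, failure rate 1.0 in each, exponentially distributed durations with mean 1.0.
random_case = mission_failure_prob([1.0, 1.0], [lst_exponential(1.0), lst_exponential(1.0)])
# Same routine with fixed durations recovers the deterministic formula.
fixed_case = mission_failure_prob([0.1, 0.2], [lst_deterministic(1.0), lst_deterministic(2.0)])
print(random_case, fixed_case)
```

Because only the transform of each duration is needed, a user-specified distribution can be swapped in by supplying a different callable.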


5  Phased-Mission  Analysis:  Phase-Dependent  Success  Criteria 

The results of the previous section apply to the cases when the success criteria do not change from phase
to  phase.  However,  in  many  applications,  the  success  criteria  and  the  system  configuration  may  change  from 
phase  to  phase.  There  are  several  reasons  for  reconfiguration  and  change  in  success  criteria  from  phase  to 
phase.  Some  of  these  are  discussed  below. 

1. A component is used in all phases but its operational level requirements may change. In this case, no
special treatment is required for this component. The definition of the operational or failed state depends
on the success criteria.

2. A component is used in n consecutive phases starting with some phase k, and is then not needed for
system operation in the remaining phases.

3. A component is required to remain operational for some phases, is not needed for the operation of a few
phases, and is then required again for system operation.

4.  Additional  redundant  modules  are  added  during  the  operation  of  the  system. 

5.  Some  redundant  modules  are  removed  from  a  subsystem. 

6.  Spare  or  operational  redundant  modules  corresponding  to  one  subsystem  become  spare  or  redundant 
modules  for  another  subsystem. 

Due to a change in success criterion, it is possible that some combination of failures of components in
one phase leads to failure of the system whereas the same combination does not lead to failure in some other
phase. In Markov chain-based methods, it is easier to keep track of the system states, and therefore a change
in system success criteria can be easily accounted for. However, in the case of a fault tree, this change
needs to be accounted for by considering cases when the system may fail or may not fail at the time of phase
transition. There are four possible cases which may occur at the time of a phase transition from phase i to
phase i+1.

1. A combination of component failures does not lead to system failure in either phase i or phase i+1.

2. A combination of component failures leads to system failure in both phases i and i+1.

3. A combination of component failures does not imply system failure in phase i but is treated as system
failure in phase i+1.

4. A combination of component failures implies system failure in phase i but does not imply system failure
in phase i+1.
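The four cases can be illustrated by classifying every combination of component failures for a transition between two of the example phases of Figure 2. This is a sketch only; the function names are hypothetical and the two fault trees are restated so that the block is self-contained.

```python
from itertools import product

def fails_x(a, b, c):          # Phase X: any component failure
    return a or b or c

def fails_y(a, b, c):          # Phase Y: A fails, or both B and C fail
    return a or (b and c)

def classify(fails_now, fails_next):
    """Count the failure combinations that fall under cases 1 to 4 for one transition."""
    counts = {1: 0, 2: 0, 3: 0, 4: 0}
    for combo in product([False, True], repeat=3):   # True = component failed
        now, nxt = fails_now(*combo), fails_next(*combo)
        if not now and not nxt:
            case = 1
        elif now and nxt:
            case = 2
        elif not now and nxt:
            case = 3
        else:
            case = 4
        counts[case] += 1
    return counts

x_to_y = classify(fails_x, fails_y)
y_to_x = classify(fails_y, fails_x)
print(x_to_y, y_to_x)
```

Going from the stricter Phase X to Phase Y produces only case 4 mismatches, while the reverse order produces only case 3 mismatches, which is why the phase order matters in the analysis.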

The first two cases require treatment similar to that in the previous section, as the success criteria do
not change from phase i to phase i+1 with respect to the failure combination under consideration. Failure
combinations in the third case above should be treated as failures in the earlier phase i as well. This is
because such combinations, once present during a phase, are bound to lead to system failure eventually,
at the transition time when the system enters the later phase. These are referred to as latent failures in
[11]. Hence a more stringent criterion should be applied with respect to these combinations. So we can
assume that all failure combinations in phase i+1 are also failure combinations in phase i (but not vice
versa). Hence for the first three cases, the unreliability can be evaluated by evaluating the fault tree for the
last phase using the approach of Section 4.

The failure combinations which imply system failure in phase i, but do not lead to system failure in
subsequent phases, as in the fourth case, should be handled more carefully. We need to account for the
probability of occurrence of these failure combinations until phase i. Any probability attributed to such
combinations of component failures in later phases does not contribute towards system unreliability. Esary
and Ziehms account for this by cascading the phase reliability blocks. However, as mentioned earlier, that
leads to a more expensive computation. We present our method of handling such failure combinations below.

Our  methodology  consists  of  the  following  steps.  We  divide  the  system  unreliability  of  a  phased  mission 
system  into  two  parts:  (i)  common  failure  combinations;  and  (ii)  phase  failure  combinations.  We  evaluate 
the  unreliability  due  to  these  two  components  using  the  following  procedure. 

5.1  Common  Failure  Combinations 

The first component, common failure combinations, includes the probability of those component failure
combinations which are common to all phases after the most stringent criterion has been applied to all phases.
That is, if a combination leads to system failure in phase i+1, then it is considered a failure combination in
phase i as well. Thus the common failure combinations essentially include the failure combinations specified
for the last phase.

The  unreliability  due  to  common  failure  combinations  can  be  computed  using  the  method  described  in 
the  previous  section  for  analyzing  phased-mission  system  with  phase-independent  success  criteria.  That  is, 
we compute the failure probability distribution for each individual component and then evaluate the common
fault  tree  which  is  the  fault  tree  for  the  last  phase. 

5.2  Phase  Failure  Combinations 

The second component, phase failure combinations, includes the probability of all failures specific to individual
phases after applying the most stringent success criterion in each phase. For phase i, this part includes the
probability of only those component failure combinations which contribute to system failure in phase i but
are considered operational in all subsequent phases.

Unreliability due to the second component requires additional computations. For each phase, we need to
identify and compute the probability of component failure combinations which lead to system failure in that
phase and do not imply system failure in any subsequent phase. Let $E_i$ be the Boolean logic expression
specifying the failure combinations for phase i. Then the phase failure combinations for phase i ($PFC_i$), which
are treated as success combinations for all subsequent phases, are given by

$$PFC_i = (\cdots((E_i \wedge \overline{E_{i+1}}) \wedge \overline{E_{i+2}}) \wedge \cdots \wedge \overline{E_p})$$

In the above expression, we include only those combinations which are failure combinations in phase i but
are not failure combinations in any of the subsequent phases. This expression can be simplified as

$$PFC_i = E_i \wedge \overline{(E_{i+1} \vee \cdots \vee E_p)}.$$
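The definition of the phase failure combinations can be checked by brute force on the three-component example. In the sketch below, each component history is encoded by the number of phases the component survives completely, and the phase-i criterion is evaluated on the component states at the end of phase i. This encoding, the dictionary of criteria, and the function names are assumptions of the sketch rather than part of the method.

```python
from itertools import product

# Failure criteria of Figure 2, written on "operational" flags (True = still working).
CRITERIA = {
    "X": lambda a, b, c: not a or not b or not c,
    "Y": lambda a, b, c: not a or (not b and not c),
    "Z": lambda a, b, c: not a and not b and not c,
}

def phase_fails(order, i, history):
    """Does the criterion of the i-th phase (1-based) hold at the end of that phase?
    history[j] is the number of phases component j survives completely."""
    up = [survived >= i for survived in history]
    return CRITERIA[order[i - 1]](*up)

def pfc(order, i):
    """Histories that are failures in phase i but in no later phase."""
    p = len(order)
    return [h for h in product(range(p + 1), repeat=3)
            if phase_fails(order, i, h)
            and not any(phase_fails(order, k, h) for k in range(i + 1, p + 1))]

print(len(pfc("XYZ", 1)), len(pfc("XYZ", 2)))
print(pfc("YZX", 1), pfc("YZX", 2))
```

When the most stringent phase comes last, as in the order Y, Z, X, both lists are empty, because every earlier failure combination is also a failure combination of the last phase.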


5.3  Phase  Algebra 

Let $\bar{A} = 1$ mean that component A has failed. Then $A = 0$ says that component A has failed and $A = 1$
means that component A is operational. Using this notation, for the system described in Figure 2 the
following Boolean expressions describe the failure combinations for phases X, Y, and Z.

$$E_X = \bar{A} + \bar{B} + \bar{C}$$

$$E_Y = \bar{A} + \bar{B}\,\bar{C}$$

$$E_Z = \bar{A}\,\bar{B}\,\bar{C}$$

It should be noted that in the expression for $PFC_i$, event $\bar{A}$ denotes the failure of component A in phase
i only. Thus for each phase, we need to define a separate symbol for each component. This is very similar
to the notation of Esary and Ziehms, who have a separate symbol denoting failure of a component in each
phase. Let $A_i = 1$ denote the event that component A is operational during the interval from the start of
the mission until the end of phase i. This automatically implies that the component is operational during
all earlier phases as well. With this addition, the Boolean expressions for phases X, Y, and Z used in system
phase i are denoted by $E_{iX}$, $E_{iY}$, and $E_{iZ}$, respectively, and are given by the following.

$$E_{iX} = \bar{A}_i + \bar{B}_i + \bar{C}_i$$

$$E_{iY} = \bar{A}_i + \bar{B}_i\,\bar{C}_i$$

$$E_{iZ} = \bar{A}_i\,\bar{B}_i\,\bar{C}_i$$

When  the  expression  for  PFCi  is  simplified,  we  need  to  merge  different  combinations  of  such  terms  which 
could  be  a  little  tricky  and  need  special  treatment.  Let  i  and  j  be  two  phases  and  let  i  <  j.  The  following 
rules  should  be  used  to  simplify  the  logic  expressions. 


$$\begin{array}{ll}
A_i A_j \rightarrow A_j & \qquad A_i + A_j \rightarrow A_i \\
\bar{A}_i \bar{A}_j \rightarrow \bar{A}_i & \qquad \bar{A}_i + \bar{A}_j \rightarrow \bar{A}_j \\
\bar{A}_i A_j \rightarrow 0 & \qquad A_i + \bar{A}_j \rightarrow 1
\end{array} \qquad (2)$$

$A_i \bar{A}_j$ and $\bar{A}_i + A_j$ do not simplify any further. What the first combination means is that component A is
operational until the end of phase i and then fails sometime between the end of phase i and the end of phase j.
The second term has no physical meaning. Also, if a component fails during a phase and then it is required
to be operational during a later phase, then the two events cannot be satisfied at the same time. That is
why $\bar{A}_i A_j \rightarrow 0$ holds.

The correctness of these relations can be verified by considering the following. Let $a_i = 1$ denote that
component A is operational during phase i only. Then $A_i = a_1 a_2 \cdots a_i$ and $A_j = a_1 a_2 \cdots a_j$. Now by
substituting these values on both sides of each of these relations, we can verify that Relations (2) hold.
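The substitution argument can also be carried out mechanically. The sketch below enumerates every assignment of the per-phase flags a_1, ..., a_p, forms A_i and A_j as products of the leading flags, and checks the six simplification rules for every pair i < j. The function name is hypothetical.

```python
from itertools import product

def check_phase_algebra(num_phases):
    """Return True if all six simplification rules hold for every i < j."""
    for flags in product([False, True], repeat=num_phases):   # a_1 ... a_p
        for i in range(1, num_phases + 1):
            for j in range(i + 1, num_phases + 1):
                a_i = all(flags[:i])          # A_i = a_1 a_2 ... a_i
                a_j = all(flags[:j])          # A_j = a_1 a_2 ... a_j
                rules = [
                    (a_i and a_j) == a_j,
                    (a_i or a_j) == a_i,
                    ((not a_i) and (not a_j)) == (not a_i),
                    ((not a_i) or (not a_j)) == (not a_j),
                    not ((not a_i) and a_j),      # product is always 0
                    a_i or (not a_j),             # sum is always 1
                ]
                if not all(rules):
                    return False
    return True

print(check_phase_algebra(3))
```

The check relies only on the fact that A_j implies A_i whenever i < j, which is exactly what the product representation expresses.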

5.4  System  Unreliability 

Using the phase algebra, the system unreliability can be computed as follows. First compute the $PFC_i$
for all phases. Then the system unreliability is given by

$$UR = P(E_p) + \sum_{i=1}^{p-1} P(PFC_i) \qquad (3)$$

where $P(E_p)$ is the probability of failure evaluated using the fault tree $E_p$ of phase p (the last phase), using
the failure distribution function calculated for each component as described in Section 3. $P(PFC_i)$ is the
probability of the phase failure combinations for phase i. To calculate the $P(PFC_i)$, we will require the probability of
events such as: a component remains operational during all phases from 1 to i, or a component
remains operational during phase 1 to phase k and then fails during phase k+1 to phase i for some k. Such
probabilities can also be calculated using the techniques defined in Section 3.
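Equation (3) can be evaluated numerically for the three-component example by enumerating component histories instead of manipulating Boolean expressions. The sketch below is a brute-force cross-check, not the algebraic procedure itself. For brevity it assumes one constant failure rate per component and fixed phase durations; all names and values are hypothetical.

```python
import math
from itertools import permutations, product

CRITERIA = {   # failure criteria on "operational" flags (True = still working)
    "X": lambda a, b, c: not a or not b or not c,
    "Y": lambda a, b, c: not a or (not b and not c),
    "Z": lambda a, b, c: not a and not b and not c,
}
RATES = (1e-3, 2e-3, 3e-3)                       # hypothetical rates of A, B, C
DURATION = {"X": 10.0, "Y": 100.0, "Z": 50.0}    # hypothetical phase durations

def history_prob(rate, survived, ends):
    """P(component survives exactly `survived` complete phases); ends[i] is the end time of phase i."""
    alive = math.exp(-rate * ends[survived])
    if survived == len(ends) - 1:
        return alive
    return alive - math.exp(-rate * ends[survived + 1])

def unreliability(order):
    p = len(order)
    ends = [0.0]
    for name in order:
        ends.append(ends[-1] + DURATION[name])
    p_last = 0.0                 # P(E_p)
    p_pfc = [0.0] * (p - 1)      # P(PFC_1) ... P(PFC_{p-1})
    for hist in product(range(p + 1), repeat=3):
        prob = 1.0
        for rate, s in zip(RATES, hist):
            prob *= history_prob(rate, s, ends)
        fails = [CRITERIA[order[i - 1]](*(s >= i for s in hist)) for i in range(1, p + 1)]
        if fails[-1]:
            p_last += prob
        else:
            for i in range(p - 2, -1, -1):       # latest earlier phase that fails
                if fails[i]:
                    p_pfc[i] += prob
                    break
    return p_last + sum(p_pfc)

results = {"".join(o): unreliability(o) for o in permutations("XYZ")}
print(results)
```

Orders that end with Phase X give the same value, since in that case the last-phase fault tree already covers every failure combination and all phase failure combinations are empty.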
5.5  Example

In  this  section,  we  demonstrate  our  technique  using  the  example  described  in  Figure  2.  This  system  has 
three  components  and  we  describe  three  phases,  X,  Y,  and  Z.  To  show  the  difference,  we  will  consider  all 
the  six  permutations  of  three  phases.  The  failure  combinations  of  three  phases  are  defined  by  Ex ,  Ey ,  and 
Ez  above. 

Now  we  discuss  each  of  the  six  permutations  separately. 


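Permutation X Y Z. In this case the first phase is phase X, followed by phase Y, which is followed by phase
Z. So the $PFC_i$ functions are obtained as follows.

$$\begin{aligned}
PFC_1 &= (E_{1X} \cdot \overline{E_{2Y}}) \cdot \overline{E_{3Z}} \\
      &= ((\bar{A}_1 + \bar{B}_1 + \bar{C}_1) \cdot \overline{(\bar{A}_2 + \bar{B}_2\bar{C}_2)}) \cdot \overline{(\bar{A}_3\bar{B}_3\bar{C}_3)} \\
      &= A_3 B_2 \bar{C}_1 + A_3 \bar{B}_1 C_2 + A_2 B_3 \bar{C}_1 + A_2 \bar{B}_1 C_3 \\
PFC_2 &= E_{2Y} \cdot \overline{E_{3Z}} \\
      &= (\bar{A}_2 + \bar{B}_2\bar{C}_2) \cdot \overline{(\bar{A}_3\bar{B}_3\bar{C}_3)} \\
      &= A_3 \bar{B}_2 \bar{C}_2 + \bar{A}_2 B_3 + \bar{A}_2 C_3
\end{aligned} \qquad (4)$$

Then the system unreliability is given by

$$UR_{XYZ} = P(E_{3Z}) + P(PFC_1) + P(PFC_2)$$

where

$$P(E_{3Z}) = P(\bar{A}_3) \cdot P(\bar{B}_3) \cdot P(\bar{C}_3)$$

$$\begin{aligned}
P(PFC_1) &= P(A_3 B_2 \bar{C}_1 + A_3 \bar{B}_1 C_2 + A_2 B_3 \bar{C}_1 + A_2 \bar{B}_1 C_3) \\
 &= P(A_3 B_2 \bar{C}_1) + P((A_3 \bar{B}_1 C_2 + A_2 B_3 \bar{C}_1 + A_2 \bar{B}_1 C_3) \cdot (\bar{A}_3 + \bar{B}_2 + C_1)) \\
 &= P(A_3 B_2 \bar{C}_1) + P(A_3 \bar{B}_1 C_2 + A_2 \bar{A}_3 B_3 \bar{C}_1 + A_2 \bar{B}_1 C_3) \\
 &= P(A_3 B_2 \bar{C}_1) + P(A_3 \bar{B}_1 C_2) + P((A_2 \bar{A}_3 B_3 \bar{C}_1 + A_2 \bar{B}_1 C_3) \cdot (\bar{A}_3 + B_1 + \bar{C}_2)) \\
 &= P(A_3 B_2 \bar{C}_1) + P(A_3 \bar{B}_1 C_2) + P(A_2 \bar{A}_3 B_3 \bar{C}_1 + A_2 \bar{A}_3 \bar{B}_1 C_3) \\
 &= P(A_3 B_2 \bar{C}_1) + P(A_3 \bar{B}_1 C_2) + P(A_2 \bar{A}_3 B_3 \bar{C}_1) + P((A_2 \bar{A}_3 \bar{B}_1 C_3) \cdot (\bar{A}_2 + A_3 + \bar{B}_3 + C_1)) \\
 &= P(A_3 B_2 \bar{C}_1) + P(A_3 \bar{B}_1 C_2) + P(A_2 \bar{A}_3 B_3 \bar{C}_1) + P(A_2 \bar{A}_3 \bar{B}_1 C_3)
\end{aligned}$$

and

$$\begin{aligned}
P(PFC_2) &= P(A_3 \bar{B}_2 \bar{C}_2 + \bar{A}_2 B_3 + \bar{A}_2 C_3) \\
 &= P(A_3 \bar{B}_2 \bar{C}_2) + P((\bar{A}_2 B_3 + \bar{A}_2 C_3) \cdot (\bar{A}_3 + B_2 + C_2)) \\
 &= P(A_3 \bar{B}_2 \bar{C}_2) + P(\bar{A}_2 B_3 + \bar{A}_2 C_3) \\
 &= P(A_3 \bar{B}_2 \bar{C}_2) + P(\bar{A}_2 B_3) + P((\bar{A}_2 C_3) \cdot (A_2 + \bar{B}_3)) \\
 &= P(A_3 \bar{B}_2 \bar{C}_2) + P(\bar{A}_2 B_3) + P(\bar{A}_2 \bar{B}_3 C_3)
\end{aligned} \qquad (5)$$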
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It  is  easy  to  compute  the  probability  of  failure  in  phase  3  using  the  failure  distributions  for  individual 
components.  A  fault  tree  solver  such  as  SHARPE  [2]  can  be  used  to  compute  that.  Similarly,  the  probability 
of  expressions  in  Equation  4  can  be  evaluated  after  simplifying  the  expressions  as  a  sum  of  disjoint  products 
using  algorithm  such  as  the  one  described  in  [12]  and  depicted  in  5. 
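The disjoint products for permutation X Y Z can be evaluated numerically once the component failure distributions are fixed. The sketch below assumes exponential failures with the rate (0.0001/hour) and phase durations (10 hours each) used later in Section 5.7. The helper names `up`, `down`, and `up_then_down` are hypothetical, and a `~` in the comments stands for the overbar.

```python
import math

lam = 1e-4                 # failure rate per hour, same for A, B, and C (assumed)
ct = [10.0, 20.0, 30.0]    # cumulative time at the end of phases 1, 2, 3

def up(i):
    """P(component is operational through phase i), an overbarred literal."""
    return math.exp(-lam * ct[i - 1])

def down(i):
    """P(component has failed by the end of phase i), a plain literal."""
    return 1.0 - up(i)

def up_then_down(k, i):
    """P(component is operational through phase k and fails during phases k+1..i)."""
    return up(k) - up(i)

# P(E_3z): all three components have failed by the end of phase 3.
p_e3z = down(3) ** 3

# P(PFC_1): the four disjoint products for phase 1.
p_pfc1 = (up(3) * up(2) * down(1)                   # ~A3 ~B2 C1
          + up(3) * down(1) * up(2)                 # ~A3 B1 ~C2
          + up_then_down(2, 3) * up(3) * down(1)    # ~A2 A3 ~B3 C1
          + up_then_down(2, 3) * down(1) * up(3))   # ~A2 A3 B1 ~C3

# P(PFC_2): the three disjoint products for phase 2.
p_pfc2 = (up(3) * down(2) ** 2                      # ~A3 B2 C2
          + down(2) * up(3)                         # A2 ~B3
          + down(2) * down(3) * up(3))              # A2 B3 ~C3

ur_xyz = p_e3z + p_pfc1 + p_pfc2
print(f"{ur_xyz:.9f}")
```

The printed value can be compared with the Phase 3 entry for permutation X Y Z in Table 1.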


Permutation X Z Y. In this case the first phase is phase X, followed by phase Z, which is followed by
phase Y. Without going into details, the $PFC_i$ functions are computed as follows.

\[
PFC_1 = \bar{A}_3 \bar{B}_3 C_1 + \bar{A}_3 B_1 \bar{C}_3
\]

and

\[
PFC_2 = \phi
\]

The last phase in this case is phase Y. The system unreliability can be computed using

\[
\begin{aligned}
UR_{xzy} &= P(E_{3y}) + P(PFC_1) + P(PFC_2) \\
         &= P(A_3) + P(\bar{A}_3 \bar{B}_3 C_1) + P(\bar{A}_3 B_1 \bar{C}_3) + P(\bar{A}_3 B_3 C_3).
\end{aligned}
\]

Permutation Y X Z. For this case, the $PFC_i$ functions are computed as follows.

\[
PFC_1 = \phi
\]

and

\[
PFC_2 = \bar{A}_3 (B_2 + C_2) + \bar{B}_3 (A_2 + C_2) + \bar{C}_3 (A_2 + B_2)
\]

The last phase in this case is phase Z. The system unreliability can be computed using the following. (We
are omitting details of simplification.)

\[
\begin{aligned}
UR_{yxz} &= P(E_{3z}) + P(PFC_1) + P(PFC_2) \\
         &= P(A_3) \cdot P(B_3) \cdot P(C_3) + P(\bar{A}_3 B_2) + P(\bar{A}_3 \bar{B}_2 C_2) + P(A_2 \bar{B}_3) \\
         &\quad + P(A_2 B_3 \bar{C}_3) + P(\bar{A}_2 A_3 \bar{B}_3 C_2) + P(\bar{A}_2 A_3 B_2 \bar{C}_3)
\end{aligned}
\]

Permutation Y Z X. For this case, the $PFC_i$ functions are computed as follows.

\[
PFC_1 = \phi
\]

and

\[
PFC_2 = \phi
\]

The last phase in this case is phase X. The system unreliability can be computed using the following.

\[
\begin{aligned}
UR_{yzx} &= P(E_{3x}) + P(PFC_1) + P(PFC_2) \\
         &= P(A_3) + P(\bar{A}_3 B_3) + P(\bar{A}_3 \bar{B}_3 C_3)
\end{aligned}
\]

Permutation Z X Y. For this case, the $PFC_i$ functions are computed as follows.

\[
PFC_1 = \phi
\]

and

\[
PFC_2 = \bar{A}_3 B_2 \bar{C}_3 + \bar{A}_3 \bar{B}_3 C_2
\]

The last phase in this case is phase Y. The system unreliability can be computed using the following.

\[
\begin{aligned}
UR_{zxy} &= P(E_{3y}) + P(PFC_1) + P(PFC_2) \\
         &= P(A_3) + P(\bar{A}_3 B_3 C_3) + P(\bar{A}_3 \bar{B}_3 C_2) + P(\bar{A}_3 B_2 \bar{C}_3)
\end{aligned}
\]

Permutation Z Y X. For this case, the $PFC_i$ functions are computed as follows.

\[
PFC_1 = \phi
\]

and

\[
PFC_2 = \phi
\]

The last phase in this case is phase X. The system unreliability can be computed using the following.

\[
\begin{aligned}
UR_{zyx} &= P(E_{3x}) + P(PFC_1) + P(PFC_2) \\
         &= P(A_3) + P(\bar{A}_3 B_3) + P(\bar{A}_3 \bar{B}_3 C_3)
\end{aligned}
\]
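The six closed-form results can be cross-checked by brute force. The sketch below is not the phase algebra itself: it enumerates, for each component, the phase in which it fails (or that it survives the mission) and sums the probabilities of the outcomes in which some phase's failure criterion holds at the end of that phase. It assumes exponential failures with the rate and 10-hour phases of Section 5.7, and all names in it are hypothetical.

```python
import itertools
import math

LAM = 1e-4        # failure rate per hour for A, B, and C (assumed)
DURATION = 10.0   # hours per phase (assumed)

# Failure criteria of the example; f maps a component name to "has failed".
CRITERIA = {
    "X": lambda f: f["A"] or f["B"] or f["C"],
    "Y": lambda f: f["A"] or (f["B"] and f["C"]),
    "Z": lambda f: f["A"] and f["B"] and f["C"],
}

def outcome_probability(k, n):
    """P(a component fails in phase k); k == n + 1 means it survives all n phases."""
    if k == n + 1:
        return math.exp(-LAM * DURATION * n)
    return math.exp(-LAM * DURATION * (k - 1)) - math.exp(-LAM * DURATION * k)

def unreliability(order):
    """Mission unreliability for a sequence of phase types such as 'XYZ'."""
    n = len(order)
    total = 0.0
    for ks in itertools.product(range(1, n + 2), repeat=3):
        fail_phase = dict(zip("ABC", ks))
        # The mission fails if the criterion of phase i holds at the end of phase i.
        mission_fails = any(
            CRITERIA[name]({c: fail_phase[c] <= i for c in "ABC"})
            for i, name in enumerate(order, start=1)
        )
        if mission_fails:
            total += math.prod(outcome_probability(k, n) for k in ks)
    return total

results = {"".join(p): unreliability(p) for p in itertools.permutations("XYZ")}
for name, value in results.items():
    print(name, f"{value:.9f}")
```

Enumeration grows exponentially with the number of components, so it serves only as a check on small examples such as this one.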

5.6  Exact  Solution  Using  Markov  Chain 

The same three-component system can be analyzed using Markov chains for the six permutations. There
are eight possible states in each phase, as depicted in Figure 3. Using the same notation for the names of
states (for example, state 101 represents that components A and C are operational and component B has
failed), we can derive expressions for the state occupancy probabilities (SOPs) at the end of each phase.
Depending on the success criteria, for each failure state in phase $p$, the initial state occupancy probability
of the same state in phase $p+1$ is zero.

Let $P_{p(s)}$ denote the SOP of state $s$ in phase $p$, where $s \in \{000, 001, 010, 011, 100, 101, 110, 111\}$ and
$p$ = 1, 2, and 3. Again, let $T_p$ denote the duration of phase $p$ and let $CT_p$ denote the sum of the
durations of the first $p$ phases. Let $\lambda_{a_p}$, $\lambda_{b_p}$, and $\lambda_{c_p}$ denote the failure rates of components
A, B, and C, respectively, in phase $p$. Using this notation, the SOPs for phase $p$ can be derived from the
SOPs for phase $p-1$ and are given in Equation 6.

Using the relationship in Equation 6, we can compute the SOPs of the operational states for each phase.
The unreliability at the end of each phase is given by 1 minus the sum of the SOPs of the operational states
in that phase. At the end of a phase, the SOPs of the failure states in that phase can be set to zero, as this
probability mass is not carried forward to success states in the next phase. For example, for permutation
X Y Z, initially $P_{0(s)} = 0.0$ for all states $s \neq 111$ and $P_{0(111)} = 1.0$. Using these values and the success
criteria for phase X, at the end of phase X we assign $P_{1(s)}(CT_1) = 0.0$ for all states $s \neq 111$, and
$P_{1(111)}(CT_1)$ is calculated using Equation 6. Using these values and the success criteria of phase Y, we can
compute the SOPs for phase 2. At the beginning of phase 3, we assign $P_{2(s)}(CT_2) = 0.0$ where
$s \in \{000, 001, 010, 011, 100\}$ and compute $P_{2(s)}(CT_2)$ where $s \in \{101, 110, 111\}$ using the relations defined
in Equation 6. Finally, using these results of phase 2, we can calculate $P_{3(s)}(CT_3)$ where
$s \in \{001, 010, 011, 100, 101, 110, 111\}$.

Sometimes a backward or need-based computation may be more useful. For example, for permutation
Z Y X, we only need to calculate $P_{3(111)}(CT_3)$, which requires only $P_{2(111)}(CT_2)$. This, in turn, requires
computation of $P_{1(111)}(CT_1)$, which can be calculated using $P_{0(111)}(CT_0) = 1.0$. Finally, the unreliability of
the three-phase system is $1 - P_{3(111)}(CT_3)$. However, the intermediate unreliabilities at the end of phases 1
and 2 may require more computation.


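\[
\begin{aligned}
P_{p(111)}(CT_{p-1}+t) &= P_{p-1(111)}(CT_{p-1})\, e^{-(\lambda_{a_p}+\lambda_{b_p}+\lambda_{c_p})t} \\
P_{p(110)}(CT_{p-1}+t) &= P_{p-1(111)}(CT_{p-1})\, e^{-(\lambda_{a_p}+\lambda_{b_p})t} (1 - e^{-\lambda_{c_p}t}) + P_{p-1(110)}(CT_{p-1})\, e^{-(\lambda_{a_p}+\lambda_{b_p})t} \\
P_{p(101)}(CT_{p-1}+t) &= P_{p-1(111)}(CT_{p-1})\, e^{-(\lambda_{a_p}+\lambda_{c_p})t} (1 - e^{-\lambda_{b_p}t}) + P_{p-1(101)}(CT_{p-1})\, e^{-(\lambda_{a_p}+\lambda_{c_p})t} \\
P_{p(011)}(CT_{p-1}+t) &= P_{p-1(111)}(CT_{p-1}) (1 - e^{-\lambda_{a_p}t})\, e^{-(\lambda_{b_p}+\lambda_{c_p})t} + P_{p-1(011)}(CT_{p-1})\, e^{-(\lambda_{b_p}+\lambda_{c_p})t} \\
P_{p(100)}(CT_{p-1}+t) &= P_{p-1(111)}(CT_{p-1})\, e^{-\lambda_{a_p}t} (1 - e^{-\lambda_{b_p}t}) (1 - e^{-\lambda_{c_p}t}) + P_{p-1(100)}(CT_{p-1})\, e^{-\lambda_{a_p}t} \\
  &\quad + P_{p-1(110)}(CT_{p-1})\, e^{-\lambda_{a_p}t} (1 - e^{-\lambda_{c_p}t}) + P_{p-1(101)}(CT_{p-1})\, e^{-\lambda_{a_p}t} (1 - e^{-\lambda_{b_p}t}) \\
P_{p(010)}(CT_{p-1}+t) &= P_{p-1(111)}(CT_{p-1}) (1 - e^{-\lambda_{a_p}t})\, e^{-\lambda_{b_p}t} (1 - e^{-\lambda_{c_p}t}) + P_{p-1(010)}(CT_{p-1})\, e^{-\lambda_{b_p}t} \\
  &\quad + P_{p-1(110)}(CT_{p-1}) (1 - e^{-\lambda_{a_p}t})\, e^{-\lambda_{b_p}t} + P_{p-1(011)}(CT_{p-1})\, e^{-\lambda_{b_p}t} (1 - e^{-\lambda_{c_p}t}) \\
P_{p(001)}(CT_{p-1}+t) &= P_{p-1(111)}(CT_{p-1}) (1 - e^{-\lambda_{a_p}t}) (1 - e^{-\lambda_{b_p}t})\, e^{-\lambda_{c_p}t} + P_{p-1(001)}(CT_{p-1})\, e^{-\lambda_{c_p}t} \\
  &\quad + P_{p-1(101)}(CT_{p-1}) (1 - e^{-\lambda_{a_p}t})\, e^{-\lambda_{c_p}t} + P_{p-1(011)}(CT_{p-1}) (1 - e^{-\lambda_{b_p}t})\, e^{-\lambda_{c_p}t}
\end{aligned}
\tag{6}
\]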
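The forward computation described in this section can be sketched as follows. Instead of coding each line of Equation 6, the sketch builds the transition probability between two states as a product over components, which should give the same values for independent components without repair. The equal rates and 10-hour phases are taken from Section 5.7, and the function and variable names are hypothetical.

```python
import itertools
import math

LAM = {"A": 1e-4, "B": 1e-4, "C": 1e-4}               # failure rates per hour (assumed)
STATES = list(itertools.product((1, 0), repeat=3))     # (a, b, c), 1 = operational

# Success criteria of each phase type, expressed on a state tuple.
OPERATIONAL = {
    "X": lambda s: s == (1, 1, 1),
    "Y": lambda s: s[0] == 1 and (s[1] == 1 or s[2] == 1),
    "Z": lambda s: s != (0, 0, 0),
}

def transition_probability(src, dst, t):
    """P(state dst after time t | state src now), with no repair."""
    p = 1.0
    for comp, before, after in zip("ABC", src, dst):
        stay_up = math.exp(-LAM[comp] * t)
        if before == 1:
            p *= stay_up if after == 1 else 1.0 - stay_up
        elif after == 1:
            return 0.0   # a failed component cannot become operational again
    return p

def phase_unreliabilities(order, durations):
    """Unreliability at the end of each phase for a phase sequence such as 'XYZ'."""
    sop = {s: 0.0 for s in STATES}
    sop[(1, 1, 1)] = 1.0
    result = []
    for name, t in zip(order, durations):
        new_sop = {d: sum(sop[s] * transition_probability(s, d, t) for s in STATES)
                   for d in STATES}
        # Probability mass in failure states of this phase is not carried forward.
        sop = {s: (p if OPERATIONAL[name](s) else 0.0) for s, p in new_sop.items()}
        result.append(1.0 - sum(sop.values()))
    return result

xyz = phase_unreliabilities("XYZ", [10.0, 10.0, 10.0])
print([f"{u:.9f}" for u in xyz])
```

The three values can be compared with the X Y Z column of Table 1.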


5.7  Comparison  with  Other  Techniques 

We analyze the above six scenarios using the technique discussed in this paper, the Esary and Ziehms approach,
analytic solution of Markov chains, the phased-mission approach of [10] and [9], and the phased-mission approach
of [8]. We assume that each of the three phases lasts 10 hours and that the failure rate of each component
is 0.0001/hour. Thus the input data do not skew the results in any direction, since all components are similar
and all phases are similar. The results are affected only by the sequencing of phases and the system
success criteria.

We obtain the results shown in Tables 1 and 2. The results for the six permutations of phases X, Y, and
Z are obtained (and listed) at the end of each phase. When the worst-case criterion is applied, that is, when a
failed state in one phase is considered a failed state in all subsequent phases, the resulting unreliability can be
very high.

Table 1: Unreliability of Phased-Mission System (Accurate Analysis)

Permutation   X Y Z         X Z Y         Y X Z         Y Z X         Z X Y         Z Y X
Phase 1       0.002995504   0.002995504   0.001000498   0.001000498   0.000000001   0.000000001
Phase 2       0.003993006   0.002995505   0.005982036   0.001000502   0.005982036   0.002001985
Phase 3       0.003993009   0.004991493   0.005982037   0.008959621   0.006976549   0.008959621

Table 2: Unreliability of Phased-Mission System (Worst Case Scenario)

Permutation   X Y Z         X Z Y         Y X Z         Y Z X         Z X Y         Z Y X
Phase 1       0.002995504   0.002995504   0.001000498   0.001000498   0.000000001   0.000000001
Phase 2       0.005982035   0.005982036   0.005982036   0.002001985   0.005982036   0.002001985
Phase 3       0.008959621   0.008959621   0.008959621   0.008959621   0.008959621   0.008959621
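The worst-case figures can be approximated under one reading of the criterion: by the end of phase k, the system counts as failed if the component states at that time satisfy the failure criterion of any phase up to and including k. This reading is an assumption made for the sketch below, not a definition taken from the text, and the names are hypothetical. The rate and 10-hour phases are those stated earlier in this section.

```python
import itertools
import math

LAM = 1e-4        # failure rate per hour for A, B, and C
DURATION = 10.0   # hours per phase

# Failure criteria of the example as functions of (A failed, B failed, C failed).
FAILS = {
    "X": lambda a, b, c: a or b or c,
    "Y": lambda a, b, c: a or (b and c),
    "Z": lambda a, b, c: a and b and c,
}

def worst_case(order):
    """Worst-case unreliability at the end of each phase (assumed reading)."""
    values = []
    for k in range(1, len(order) + 1):
        q = 1.0 - math.exp(-LAM * DURATION * k)   # P(component failed by end of phase k)
        total = 0.0
        for failed in itertools.product((False, True), repeat=3):
            # Failed if the current states match the criterion of any phase so far.
            if any(FAILS[name](*failed) for name in order[:k]):
                total += math.prod(q if f else 1.0 - q for f in failed)
        values.append(total)
    return values

yzx = worst_case("YZX")
print([f"{u:.9f}" for u in yzx])
```

Under this reading, every permutation reaches the same Phase 3 value, because phase X appears in all of them and any component failure is a failure combination for X.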

The important thing to observe here is that when we allow failure combinations (failure states in Markov
chains) to become operational combinations (up states in Markov chains) in a later phase, the overall
unreliability of the system can be substantially lower, as is the case in the last column. For example, in
a spacecraft, launch is the most important activity. After launch, the activities or components that could
have caused a failure during launch no longer make any difference. Thus those failure combinations are
operational combinations for the rest of the mission.

To further explore the impact of phase configurations and phase durations, we varied the phase
durations. In the first variation, we assume that the first phase always lasts 1 hour, the second
phase lasts 10 hours, and the third phase lasts 100 hours, irrespective of which phase configuration
(X, Y, or Z) is used during each phase. The results for this variation for the two cases are shown
in Tables 3 and 4, respectively. In another variation, we assume that phase X always lasts 1 hour,
phase Y always lasts 10 hours, and phase Z always lasts 100 hours, irrespective of where in
the mission these phase configurations are used. The results are given in Tables 5 and 6, respectively. In this
case, the results differ by more than an order of magnitude depending on the ordering of the phases. If the
most stringent success criteria apply during the early phases, then phased-mission analysis is more meaningful.

It should be noted that the techniques in [10], [8], and [9] are capable of handling the more general case
of repairable systems, while the technique discussed by Esary and Ziehms and the one presented in this
paper are both restricted to non-repairable systems. The technique in [9] is the most general but also the
most expensive in computation time, and in this case it will yield the same result as [10] because neither
makes any approximations.




Table 3: Unreliability with 1, 10, and 100 hours phases (Accurate Analysis)

Permutation   X Y Z         X Z Y         Y X Z         Y Z X         Z X Y         Z Y X
Phase 1       0.000299955   0.000299955   0.000100005   0.000100005   0.000000000   0.000000000
Phase 2       0.001300153   0.000299956   0.003294561   0.000100006   0.003294561   0.001100603
Phase 3       0.001301332   0.011354728   0.003295543   0.032751658   0.013309644   0.032751658

Table 4: Unreliability with 1, 10, and 100 hours phases (Worst Case Scenario)

Permutation   X Y Z         X Z Y         Y X Z         Y Z X         Z X Y         Z Y X
Phase 1       0.000299955   0.000299955   0.000100005   0.000100005   0.000000000   0.000000000
Phase 2       0.003294561   0.003294561   0.003294561   0.000200020   0.003294561   0.001100603
Phase 3       0.032751658   0.032751658   0.032751658   0.032751658   0.032751658   0.032751658

6  Conclusions 

We have presented a technique to analyze phased-mission systems using fault trees. This technique yields
accurate results and is simpler in concept and computation than Markov chain-based analysis. For this purpose,
we developed a phase algebra that allows us to efficiently compute the probability of all possible combinations
contributing to failure in phased-mission systems during individual phases. This technique will be very useful
for a large class of systems whose behavior can be described using fault trees.

[Tables 5 and 6: unreliability when phases X, Y, and Z last 1, 10, and 100 hours, respectively. Only fragments of
these tables survive: 0.001000498, 0.003294561, 0.003294561, 0.003294561, 0.011058089, 0.029845556, 0.032751658.]